Straight to Shapes: Real-time Detection of Encoded Shapes
Current object detection approaches predict bounding boxes, but these provide
little instance-specific information beyond location, scale and aspect ratio.
In this work, we propose to directly regress to objects' shapes in addition to
their bounding boxes and categories. It is crucial to find an appropriate shape
representation that is compact and decodable, and in which objects can be
compared for higher-order concepts such as view similarity, pose variation and
occlusion. To achieve this, we use a denoising convolutional auto-encoder to
establish an embedding space, and place the decoder after a fast end-to-end
network trained to regress directly to the encoded shape vectors. This yields
what to the best of our knowledge is the first real-time shape prediction
network, running at ~35 FPS on a high-end desktop. With higher-order shape
reasoning well-integrated into the network pipeline, the network shows the
useful practical quality of generalising to unseen categories similar to the
ones in the training set, something that most existing approaches fail to
handle. Comment: 16 pages including appendix; Published at CVPR 201
Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos
In this work, we propose an approach to the spatiotemporal localisation
(detection) and classification of multiple concurrent actions within temporally
untrimmed videos. Our framework is composed of three stages. In stage 1,
appearance and motion detection networks are employed to localise and score
actions from colour images and optical flow. In stage 2, the appearance network
detections are boosted by combining them with the motion detection scores, in
proportion to their respective spatial overlap. In stage 3, sequences of
detection boxes most likely to be associated with a single action instance,
called action tubes, are constructed by solving two energy maximisation
problems via dynamic programming. While in the first pass, action paths
spanning the whole video are built by linking detection boxes over time using
their class-specific scores and their spatial overlap, in the second pass,
temporal trimming is performed by ensuring label consistency for all
constituting detection boxes. We demonstrate the performance of our algorithm
on the challenging UCF101, J-HMDB-21 and LIRIS-HARL datasets, achieving new
state-of-the-art results across the board and significantly increasing
detection speed at test time. We achieve a huge leap forward in action
detection performance and report a 20% and 11% gain in mAP (mean average
precision) on UCF-101 and J-HMDB-21 datasets respectively when compared to the
state-of-the-art. Comment: Accepted by British Machine Vision Conference 201
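The first linking pass described in the abstract above (building an action path by linking detection boxes over time using class scores and spatial overlap) can be illustrated with a small Viterbi-style dynamic programme. This is a minimal sketch, not the authors' implementation: the energy used here (sum of per-box scores plus a weighted intersection-over-union between consecutive boxes), the `overlap_weight` parameter, and the function names are assumptions made for illustration, and the second temporal-trimming pass is not shown.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_boxes(frames, overlap_weight=1.0):
    """Pick one detection per frame so that the assumed energy is maximal.

    frames: list (one entry per frame) of non-empty lists of (box, score).
    Returns the index of the chosen detection in each frame.
    """
    # best[k] = highest energy of any path ending at detection k of frame t
    best = [score for _, score in frames[0]]
    back = []  # back[t-1][k] = best predecessor of detection k in frame t
    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for box, score in frames[t]:
            cands = [best[j] + overlap_weight * iou(prev_box, box)
                     for j, (prev_box, _) in enumerate(frames[t - 1])]
            j_star = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[j_star] + score)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)
    # Backtrack from the best final detection to recover the whole path.
    k = max(range(len(best)), key=best.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path
```

With `overlap_weight=0` the sketch reduces to picking the highest-scoring box in each frame independently; a larger weight favours spatially smooth paths.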
Optical pulse propagation in a switched-on photonic lattice: Rabi effect with the roles of light and matter interchanged
A light pulse propagating in a suddenly switched on photonic lattice, when
the central frequency lies in the photonic band gap, is an analog of the Rabi
model where the two-level system is the two resonant (i.e. Bragg-coupled)
Fourier modes of the pulse, while the photonic lattice serves as a
monochromatic external field. A simple theory of these Rabi oscillations is
given and confirmed by the numerical solution of the corresponding Maxwell
equations. This is a direct, i.e. temporal, analog of the Rabi effect,
in addition to the spatial analog in optical beam propagation described in
Opt. Lett. 32, 1920 (2007). An additional high-frequency modulation of the Rabi
oscillations reflects the lattice-induced energy transfer between the electric
and magnetic fields of the pulse. Comment: 3 pages, 5 figure
InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure
Volumetric models have become a popular representation for 3D scenes in
recent years. One breakthrough leading to their popularity was KinectFusion,
which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM
has since also been tackled with very similar approaches. Representing the
reconstruction volumetrically as a TSDF leads to most of the simplicity and
efficiency that can be achieved with GPU implementations of these systems.
However, this representation is memory-intensive and limits applicability to
small-scale reconstructions. Several avenues have been explored to overcome
this. With the aim of summarizing them and providing for a fast, flexible 3D
reconstruction pipeline, we propose a new, unifying framework called InfiniTAM.
The idea is that steps like camera tracking, scene representation and
integration of new data can easily be replaced and adapted to the user's needs.
This report describes the technical implementation details of InfiniTAM v3,
the third version of our InfiniTAM system. We have added various new features,
as well as making numerous enhancements to the low-level code that
significantly improve our camera tracking performance. The new features that we
expect to be of most interest are (i) a robust camera tracking module; (ii) an
implementation of Glocker et al.'s keyframe-based random ferns camera
relocaliser; (iii) a novel approach to globally-consistent TSDF-based
reconstruction, based on dividing the scene into rigid submaps and optimising
the relative poses between them; and (iv) an implementation of Keller et al.'s
surfel-based reconstruction approach. Comment: This article largely supersedes arxiv:1410.0925 (it describes version
3 of the InfiniTAM framework
OCVD Measurement of Ambipolar and Minority Carrier Lifetime in 4H-SiC Devices: Relevance of the Measurement Setup
The open-circuit voltage decay (OCVD) method is a well-known technique for conducting electrical measurements of carrier lifetime: the main advantages lie in the simple setup and the possibility of carrying out measurements in commercial devices without needing to remove the package, as optical methods require. Although several researchers have reported carrier lifetimes measured by the OCVD method in different devices, there has been little discussion about the potential effect of the experimental setup on the obtained results. By comparing the outputs of the experimental measurements with those of numerical simulations, this study investigates the overlooked effect of the OCVD measurement setup on the former. Due to the growing importance of SiC-based devices, the analysis is applied to a 4H-SiC p-i-n diode. Two main points are addressed: 1) the effect of circuit setup on the ambipolar lifetime is discussed and a method, originally developed for improving the estimate of low-level carrier lifetime in OCVD measurements, is used to correct the measured lifetime for this influence; 2) the origin of the local minimum that may appear in the lifetime versus time curves is also investigated. It is found that the minimum can also be related to the time constant of the experimental setup, giving rise to doubts about the usual interpretation of this minimum as the minority carrier lifetime. A method is thus proposed to help discriminate between the two interpretations.
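For readers unfamiliar with OCVD, the lifetime versus time curve mentioned above is commonly obtained from the slope of the decaying voltage through the textbook relation tau = -(n kT/q) / (dV/dt), with n = 2 under high-level injection (ambipolar lifetime) and n = 1 under low-level injection (minority carrier lifetime). The sketch below applies that relation with simple finite differences. It is an illustration under these assumptions only: the function name, the `injection_factor` parameter and the sample data are hypothetical, and the setup-related correction studied in the paper is not reproduced here.

```python
K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over elementary charge, in V/K


def ocvd_lifetime(times, voltages, temperature=300.0, injection_factor=2.0):
    """Estimate the lifetime (s) between each pair of consecutive samples.

    times in seconds, voltages in volts. injection_factor is 2 for
    high-level injection and 1 for low-level injection (assumed convention).
    """
    thermal_voltage = K_B_OVER_Q * temperature
    lifetimes = []
    for i in range(len(times) - 1):
        slope = (voltages[i + 1] - voltages[i]) / (times[i + 1] - times[i])
        if slope < 0:
            lifetimes.append(-injection_factor * thermal_voltage / slope)
        else:
            # A flat or rising voltage gives no finite lifetime estimate.
            lifetimes.append(float("inf"))
    return lifetimes
```

Because the estimate depends only on the measured slope, anything in the measurement circuit that adds its own time constant to the decay changes the apparent lifetime, which is the kind of setup effect the abstract is concerned with.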
Objective-free excitation of quantum emitters with a laser-written micro parabolic mirror
The efficient excitation of quantum sources such as quantum dots or single molecules requires high-NA optics, which is often a challenge in cryogenics or in ultrafast optics. Here we propose a 3.2 µm wide parabolic mirror with a 0.8 µm focal length, fabricated by direct laser writing on CdSe/CdS colloidal quantum dots, capable of focusing the excitation light to a sub-wavelength spot and of extracting the generated emission by collimating it into a narrow beam. This mirror is fabricated via in-situ volumetric optical lithography, which can be aligned to individual emitters, and it can be easily adapted to other geometries beyond the paraboloid. This compact solid-state transducer from the far field to the emitter has important applications in objective-free quantum technologies.
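The two dimensions quoted in the abstract can be related with the equation of an ideal paraboloid, z = r^2 / (4 f). The following sketch computes the depth of the mirror at its rim and the half-angle the rim subtends as seen from the focus. It is a purely geometric illustration: it assumes the 3.2 µm width is the rim diameter and that the emitter sits exactly at the focus, neither of which the abstract states explicitly, and the function name is hypothetical.

```python
import math


def paraboloid_rim(width_um, focal_length_um):
    """Return (rim depth in µm, half-angle in degrees seen from the focus).

    The vertex is at z = 0 and the focus at z = focal_length_um on the axis.
    The half-angle is measured from the axis direction pointing at the vertex.
    """
    radius = width_um / 2.0
    depth = radius ** 2 / (4.0 * focal_length_um)
    # Angle between the focus-to-vertex axis and the focus-to-rim direction.
    half_angle = math.degrees(math.atan2(radius, focal_length_um - depth))
    return depth, half_angle


depth, half_angle = paraboloid_rim(3.2, 0.8)
print(f"rim depth = {depth:.2f} um, half-angle = {half_angle:.1f} deg")
```

Under these assumptions the rim depth comes out equal to the focal length, so the rim lies level with the focus and the mirror subtends a half-angle of about 90 degrees, consistent with the high-NA behaviour the abstract describes.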